Skip to content

feat: add URL-to-MulmoScript pipeline#102

Open
isamu wants to merge 6 commits intomainfrom
feat/url-to-mulmo
Open

feat: add URL-to-MulmoScript pipeline#102
isamu wants to merge 6 commits intomainfrom
feat/url-to-mulmo

Conversation

@isamu
Copy link
Copy Markdown
Contributor

@isamu isamu commented Feb 17, 2026

Summary

  • Add mulmo-slide url <url> command that generates MulmoScript from web article URLs
  • Fetch articles using Puppeteer + Readability via existing puppeteerCrawlerAgent from mulmocast
  • LLM-based content quality check to reject login pages, 404s, empty pages
  • Three expression styles: author (first-person), news (objective anchor), overview (concise summary)
  • Zod schema validation with up to 3 retries on LLM output
  • Optional --movie and --bundle flags to continue pipeline after MulmoScript generation

New Files

File Purpose
src/utils/expression-styles.ts Expression style definitions with LLM system prompts
src/actions/url-to-script.ts Main pipeline (fetch → quality check → LLM beats → validate → save)
src/cli/url-commands.ts CLI command registration with options
tests/test_expression_styles.ts 13 unit tests for styles and basename generation

Modified Files

File Change
src/cli/index.ts Register registerUrlCommands in reducer chain
package.json Add "url" script shortcut
AGENTS.md Add url command docs, yarn shortcut, platform requirements, test entry
README.md Add URL row to input table, full URL-to-MulmoScript section with usage/options/pipeline

Usage

mulmo-slide url https://example.com/article -e author -l ja
mulmo-slide url https://example.com/article -e news --bundle
mulmo-slide url https://example.com/article -e overview --movie -l ja -b 10

Test plan

  • yarn build passes
  • yarn lint passes (0 errors)
  • yarn test — 291/291 tests pass (13 new)
  • Manual test: yarn url <real-url> -e author -l ja generates valid MulmoScript
  • Manual test: yarn url <404-url> fails with quality check error
  • Manual test: --movie and --bundle flags trigger downstream generation

User Prompt

Web 記事から MulmoScript を自動生成し、movie/bundle まで一気通貫で生成するコマンドを追加する。mulmocast-cli から puppeteerCrawlerAgent を import して URL のテキスト抽出を行い、独自の「表現スタイル」(system prompt) で LLM にビートを生成させる。

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • URL-to-MulmoScript CLI: generate presentations from web article URLs with language, expression style, visual style, target beats, and optional movie/bundle outputs.
    • Three expression styles: author, news, overview.
    • New slide schema utility providing human-readable schema for prompts.
  • Documentation

    • "URL to MulmoScript" usage, CLI reference, examples, pipeline steps (also added to PDF docs).
  • Tests

    • Added tests for expression styles, slide schema, and basename/filename generation.
  • Chores

    • Added npm script to run the URL command.

Add `mulmo-slide url <url>` command that fetches web articles via
Puppeteer, checks content quality with LLM, and generates narrated
MulmoScript with configurable expression styles (author/news/overview).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough

Walkthrough

Adds a URL-to-MulmoScript feature: CLI command and action to fetch web articles, validate content, generate MulmoScript with retries and style application, new expression-style and slide-schema utilities, README/AGENTS docs, npm script, and tests for styles and basename generation.

Changes

Cohort / File(s) Summary
Core URL-to-Script Processing
src/actions/url-to-script.ts
New end-to-end implementation: fetch article, quality check, LLM-based beat generation with retry and Zod validation, slide-style application, basename generation, and persist article.txt + MulmoScript.
CLI Integration
src/cli/url-commands.ts, src/cli/index.ts
Registers mulmo-slide url <url> command with options (language, expression, slide style, beats, force, movie/bundle); delegates to runUrlToScript and optionally runs movie/bundle generation.
Expression Styles Utility
src/utils/expression-styles.ts
New ExpressionStyle interface, shared beat instructions, three presets (author, news, overview), and exported EXPRESSION_NAMES.
Slide Schema Utility
src/utils/slide-schema.ts
New helper that builds and memoizes a markdown string containing JSON Schema outputs for slide layouts and content blocks for inclusion in prompts.
Tests
tests/test_expression_styles.ts, tests/test_slide_schema.ts
Adds tests validating expression style metadata and prompts, EXPRESSION_NAMES, basename generation edge cases, and slide-schema content + caching.
Docs & Agents
README.md, AGENTS.md
Adds "URL to MulmoScript" documentation, CLI reference entry, and maps URL workflow into architecture docs (some duplicated content noted).
Packaging & Scripts
package.json
Adds npm script "url": "tsx src/cli/index.ts url" and bumps mulmocast/@mulmocast/types ranges.
sequenceDiagram
    participant CLI as CLI User
    participant URLCmd as URL Command
    participant URLScript as URL-to-Script Action
    participant Puppeteer as Article Fetcher
    participant OpenAI as LLM
    participant Validator as Zod Validator
    participant FS as File System

    CLI->>URLCmd: mulmo-slide url <URL> -e author -l ja
    URLCmd->>URLScript: runUrlToScript(url, options)
    URLScript->>Puppeteer: fetchArticle(url)
    Puppeteer-->>URLScript: {title, textContent, dateStr}
    URLScript->>FS: check existing / save article.txt
    URLScript->>OpenAI: checkArticleQuality(text)
    OpenAI-->>URLScript: quality verdict
    URLScript->>OpenAI: generateBeatsWithRetry(prompt)
    OpenAI-->>URLScript: MulmoScript JSON
    URLScript->>Validator: validate JSON (Zod)
    Validator-->>URLScript: valid / errors
    URLScript->>URLScript: applySlideStyle(beats, style)
    URLScript->>FS: save MulmoScript
    URLScript-->>URLCmd: return scriptPath
    URLCmd->>FS: optionally run movie/bundle generators
    URLCmd-->>CLI: completion message
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A URL hops in on a breezy trail,
Beats tumble out like a rhythmic tale,
Prompts stitch the scenes with tidy art,
Slides dress up the story — a rabbit’s part,
MulmoScript lands, and I thump with delight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'feat: add URL-to-MulmoScript pipeline' accurately and clearly describes the main change: adding a complete URL-to-MulmoScript generation pipeline with CLI command, LLM-based processing, and downstream integration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/url-to-mulmo

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/actions/url-to-script.ts (2)

64-96: Quality check result is not validated beyond JSON parsing.

Line 89 parses the LLM response as { sufficient: boolean; reason: string } but doesn't verify the shape. If the LLM returns valid JSON without a sufficient field, result.sufficient would be undefined (falsy), causing a false rejection with an undefined reason. Consider adding a guard.

Proposed fix
   const content = extractResponseContent(response);
   const result = JSON.parse(content) as { sufficient: boolean; reason: string };

-  if (!result.sufficient) {
+  if (typeof result.sufficient !== "boolean") {
+    console.warn("  ⚠ Quality check returned unexpected format, proceeding...");
+    return;
+  }
+  if (!result.sufficient) {
     throw new Error(`Article content insufficient: ${result.reason}`);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/actions/url-to-script.ts` around lines 64 - 96, In checkArticleQuality,
after calling extractResponseContent and JSON.parse, validate the parsed object
shape (from the parsed result returned by extractResponseContent/response) by
ensuring result has a boolean property "sufficient" and a string "reason" before
using them; if "sufficient" is missing or not a boolean or "reason" is
missing/not a string, throw a clear Error that includes the raw content (or the
parsed object) so we don't reject articles due to malformed LLM output, and use
strict check (result.sufficient === true) when deciding pass/fail.

44-62: Fragile type assertions on the agent function call and result.

The as unknown as AgentFunctionContext cast (Line 54) and the result destructuring (result as { data?: ArticleData }).data (Line 56) bypass type safety. If puppeteerCrawlerAgent's API changes, these will silently break at runtime.

This is acceptable for now given the agent function's complex signature, but consider adding a runtime shape check on the result (e.g., verify data has url, title, textContent fields) to fail fast with a clear error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/actions/url-to-script.ts` around lines 44 - 62, fetchArticle currently
uses fragile casts for the agent call and result; add a runtime shape validation
after receiving result from puppeteerCrawlerAgent.agent to ensure the returned
object has a data property with url, title and textContent (non-empty strings)
and throw a clear Error if validation fails; keep using agentFn =
puppeteerCrawlerAgent.agent but replace the unchecked cast and the simple
(result as { data?: ArticleData }).data access with a small inline
type-guard/validation that asserts result && typeof result === 'object' &&
result.data && typeof result.data.textContent === 'string' (and validate
url/title similarly) before returning data, so fetchArticle fails fast with a
clear message if the agent shape changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/actions/url-to-script.ts`:
- Around line 98-122: buildBeatsPrompt currently appends the entire articleText
into the LLM prompt (risking context-window overrun); change buildBeatsPrompt to
limit the article content included by either truncating articleText to a safe
max length (e.g., a configurable maxChars ~30000 or reusing the existing
3000-char quality truncation) or by implementing deterministic
chunking/summarization before beat generation and then generating/merging beats
per chunk; update any calling code that expects full-text output to handle
merged beat results and expose the maxChars as a configurable constant or
UrlToScriptOptions field so tests can control it.
- Around line 28-31: generateBasename can produce a trailing hyphen when title
is non-null but sanitizes to an empty string (e.g., "!!!"); update
generateBasename to call sanitizeBasename(title) and if the result is empty
(after slicing with TITLE_MAX_LENGTH) fall back to "article" before constructing
the `${dateStr}-${sanitizedTitle}` return value; reference function
generateBasename and sanitizeBasename and ensure TITLE_MAX_LENGTH slicing still
applies when choosing the fallback.

---

Nitpick comments:
In `@src/actions/url-to-script.ts`:
- Around line 64-96: In checkArticleQuality, after calling
extractResponseContent and JSON.parse, validate the parsed object shape (from
the parsed result returned by extractResponseContent/response) by ensuring
result has a boolean property "sufficient" and a string "reason" before using
them; if "sufficient" is missing or not a boolean or "reason" is missing/not a
string, throw a clear Error that includes the raw content (or the parsed object)
so we don't reject articles due to malformed LLM output, and use strict check
(result.sufficient === true) when deciding pass/fail.
- Around line 44-62: fetchArticle currently uses fragile casts for the agent
call and result; add a runtime shape validation after receiving result from
puppeteerCrawlerAgent.agent to ensure the returned object has a data property
with url, title and textContent (non-empty strings) and throw a clear Error if
validation fails; keep using agentFn = puppeteerCrawlerAgent.agent but replace
the unchecked cast and the simple (result as { data?: ArticleData }).data access
with a small inline type-guard/validation that asserts result && typeof result
=== 'object' && result.data && typeof result.data.textContent === 'string' (and
validate url/title similarly) before returning data, so fetchArticle fails fast
with a clear message if the agent shape changes.

Comment on lines +28 to +31
export function generateBasename(title: string | null, dateStr: string): string {
const sanitizedTitle = title ? sanitizeBasename(title).slice(0, TITLE_MAX_LENGTH) : "article";
return `${dateStr}-${sanitizedTitle}`;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Edge case: titles with only special characters produce a trailing-hyphen basename.

If title is non-null but contains only special characters (e.g., "!!!"), sanitizeBasename returns "", resulting in a basename like "20260217-". Consider falling back to "article" when the sanitized title is empty.

Proposed fix
 export function generateBasename(title: string | null, dateStr: string): string {
-  const sanitizedTitle = title ? sanitizeBasename(title).slice(0, TITLE_MAX_LENGTH) : "article";
+  const sanitizedTitle = (title ? sanitizeBasename(title).slice(0, TITLE_MAX_LENGTH) : "") || "article";
   return `${dateStr}-${sanitizedTitle}`;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function generateBasename(title: string | null, dateStr: string): string {
const sanitizedTitle = title ? sanitizeBasename(title).slice(0, TITLE_MAX_LENGTH) : "article";
return `${dateStr}-${sanitizedTitle}`;
}
export function generateBasename(title: string | null, dateStr: string): string {
const sanitizedTitle = (title ? sanitizeBasename(title).slice(0, TITLE_MAX_LENGTH) : "") || "article";
return `${dateStr}-${sanitizedTitle}`;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/actions/url-to-script.ts` around lines 28 - 31, generateBasename can
produce a trailing hyphen when title is non-null but sanitizes to an empty
string (e.g., "!!!"); update generateBasename to call sanitizeBasename(title)
and if the result is empty (after slicing with TITLE_MAX_LENGTH) fall back to
"article" before constructing the `${dateStr}-${sanitizedTitle}` return value;
reference function generateBasename and sanitizeBasename and ensure
TITLE_MAX_LENGTH slicing still applies when choosing the fallback.

Comment on lines +98 to +122
function buildBeatsPrompt(articleText: string, options: UrlToScriptOptions): string {
const styleName = options.expression ?? "author";
const style = expressionStyles[styleName];
if (!style) {
throw new Error(
`Unknown expression style: ${styleName}. Available: ${EXPRESSION_NAMES.join(", ")}`
);
}

const lang = options.lang ?? "en";
const languageName = getLanguageName(lang);

const beatsHint = options.beats ? `\nTarget approximately ${options.beats} beats.` : "";

return `${style.systemPrompt}

## Language

Write all narration text in ${languageName} (lang code: "${lang}").
${beatsHint}

## Article Content

${articleText}`;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

No truncation of article text before sending to the LLM for beat generation.

buildBeatsPrompt appends the full articleText to the prompt (Line 121). For long articles, this could exceed the model's context window and cause API errors or silent truncation. The quality check (Line 83) truncates to 3000 chars, but the beat generation does not.

Consider truncating or chunking the article text to stay within safe token limits (e.g., ~30,000 characters for gpt-4o's context).

Proposed fix
+const ARTICLE_MAX_CHARS = 30000;
+
 function buildBeatsPrompt(articleText: string, options: UrlToScriptOptions): string {
   const styleName = options.expression ?? "author";
   const style = expressionStyles[styleName];
   if (!style) {
     throw new Error(
       `Unknown expression style: ${styleName}. Available: ${EXPRESSION_NAMES.join(", ")}`
     );
   }

   const lang = options.lang ?? "en";
   const languageName = getLanguageName(lang);

   const beatsHint = options.beats ? `\nTarget approximately ${options.beats} beats.` : "";

+  const truncatedArticle = articleText.slice(0, ARTICLE_MAX_CHARS);
+
   return `${style.systemPrompt}

 ## Language

 Write all narration text in ${languageName} (lang code: "${lang}").
 ${beatsHint}

 ## Article Content

-${articleText}`;
+${truncatedArticle}`;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function buildBeatsPrompt(articleText: string, options: UrlToScriptOptions): string {
const styleName = options.expression ?? "author";
const style = expressionStyles[styleName];
if (!style) {
throw new Error(
`Unknown expression style: ${styleName}. Available: ${EXPRESSION_NAMES.join(", ")}`
);
}
const lang = options.lang ?? "en";
const languageName = getLanguageName(lang);
const beatsHint = options.beats ? `\nTarget approximately ${options.beats} beats.` : "";
return `${style.systemPrompt}
## Language
Write all narration text in ${languageName} (lang code: "${lang}").
${beatsHint}
## Article Content
${articleText}`;
}
const ARTICLE_MAX_CHARS = 30000;
function buildBeatsPrompt(articleText: string, options: UrlToScriptOptions): string {
const styleName = options.expression ?? "author";
const style = expressionStyles[styleName];
if (!style) {
throw new Error(
`Unknown expression style: ${styleName}. Available: ${EXPRESSION_NAMES.join(", ")}`
);
}
const lang = options.lang ?? "en";
const languageName = getLanguageName(lang);
const beatsHint = options.beats ? `\nTarget approximately ${options.beats} beats.` : "";
const truncatedArticle = articleText.slice(0, ARTICLE_MAX_CHARS);
return `${style.systemPrompt}
## Language
Write all narration text in ${languageName} (lang code: "${lang}").
${beatsHint}
## Article Content
${truncatedArticle}`;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/actions/url-to-script.ts` around lines 98 - 122, buildBeatsPrompt
currently appends the entire articleText into the LLM prompt (risking
context-window overrun); change buildBeatsPrompt to limit the article content
included by either truncating articleText to a safe max length (e.g., a
configurable maxChars ~30000 or reusing the existing 3000-char quality
truncation) or by implementing deterministic chunking/summarization before beat
generation and then generating/merging beats per chunk; update any calling code
that expects full-text output to handle merged beat results and expose the
maxChars as a configurable constant or UrlToScriptOptions field so tests can
control it.

- Remove "meta" from beat instructions (MulmoScript doesn't have it)
- Move imagePrompt to beat-level field instead of inside image object
- Prefer markdown slides, use imagePrompt sparingly
- Update test assertion to match corrected prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/utils/expression-styles.ts (1)

12-23: Example JSON only demonstrates the image variant; consider adding an imagePrompt beat example.

LLMs tend to mimic the structure of provided examples more faithfully than prose instructions. Since imagePrompt is a top-level beat field (not nested inside image), showing it once in the example would reduce the chance of the LLM placing it incorrectly or ignoring it entirely.

💡 Suggested addition to the example JSON
   "beats": [
     {
       "text": "<narration text for this beat>",
       "image": { "type": "markdown", "markdown": ["# Slide Title", "- point 1", "- point 2"] }
-    }
+    },
+    {
+      "text": "<narration text for this beat>",
+      "imagePrompt": "descriptive prompt for image generation"
+    }
   ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/expression-styles.ts` around lines 12 - 23, The example JSON in
expression-styles.ts only shows the "image" variant inside the beats array,
which encourages LLMs to omit the top-level beat field "imagePrompt"; add a
second example beat inside "beats" demonstrating the top-level "imagePrompt"
field (e.g., a beat object with "text" and "imagePrompt") so examples include
both "image" and "imagePrompt" variants and clarify their placement relative to
the "beats" array and "image" object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/utils/expression-styles.ts`:
- Around line 12-23: The example JSON in expression-styles.ts only shows the
"image" variant inside the beats array, which encourages LLMs to omit the
top-level beat field "imagePrompt"; add a second example beat inside "beats"
demonstrating the top-level "imagePrompt" field (e.g., a beat object with "text"
and "imagePrompt") so examples include both "image" and "imagePrompt" variants
and clarify their placement relative to the "beats" array and "image" object.

isamu and others added 4 commits February 19, 2026 13:40
Replace image.type="markdown" with image.type="slide" in the URL-to-MulmoScript
pipeline. The LLM now generates structured slide layouts (title, columns,
comparison, grid, stats, timeline, etc.) instead of raw markdown arrays.

- Add slide-schema.ts: converts Zod slide schemas to JSON Schema for LLM prompts
- Update expression-styles.ts: slide DSL output format with layout selection guide
- Update url-to-script.ts: inject schema into prompt, use slideStyles + mergeScripts
- Update --style to accept slide presets (slide_dark, slide_pop, etc.)
- Bump mulmocast to ^2.2.1 for slideStyles/mergeScripts exports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without a slide style, the generated MulmoScript has no slideParams.theme,
causing mulmocast to fail with "Slide theme is required". Now defaults to
slide_dark and also patches existing cached scripts missing slideParams.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@socket-security
Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatedmulmocast@​2.2.0 ⏵ 2.2.176 +1100100 +19870

View full report

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_slide_schema.ts (1)

38-42: Caching test verifies value equality, not actual cache hit.

assert.strictEqual on JavaScript string primitives uses ===, which compares by value — a rebuilt string with identical content would also pass. The comment "same cached reference" overstates what's being verified. The test is functionally fine (confirms deterministic output), but won't catch if cachedSchema is bypassed and the schema is rebuilt on each call.

To genuinely test cache behavior, consider spying on the internal build functions or counting call invocations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_slide_schema.ts` around lines 38 - 42, The test currently uses
assert.strictEqual on string primitives from getSlideSchemaForPrompt which only
verifies value equality, not that cachedSchema was actually reused; update the
test to assert a real cache hit by either (a) spying/stubbing the internal
builder that constructs the schema (e.g., the function used by
getSlideSchemaForPrompt) and asserting it is called only once across two calls,
or (b) expose cachedSchema or change getSlideSchemaForPrompt to return a
non-primitive (object) so you can assert reference identity (strictEqual)
reliably; locate references to getSlideSchemaForPrompt and the internal
builder/cachedSchema in the module to implement the spy or exposure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_expression_styles.ts`:
- Around line 85-93: Add a test for the edge case where the title sanitizes to
an empty string: call generateBasename with a title made only of special
characters (e.g., "!!!" or "()()!") and the date "20260217" and assert that the
returned basename does not end with a trailing dash but instead falls back to
the default slug (e.g., "20260217-article"); locate tests near the existing
generateBasename and sanitizeBasename usages in tests/test_expression_styles.ts
and add an assertion like assert.strictEqual(result, "20260217-article") to
ensure generateBasename handles empty sanitized titles.

---

Nitpick comments:
In `@tests/test_slide_schema.ts`:
- Around line 38-42: The test currently uses assert.strictEqual on string
primitives from getSlideSchemaForPrompt which only verifies value equality, not
that cachedSchema was actually reused; update the test to assert a real cache
hit by either (a) spying/stubbing the internal builder that constructs the
schema (e.g., the function used by getSlideSchemaForPrompt) and asserting it is
called only once across two calls, or (b) expose cachedSchema or change
getSlideSchemaForPrompt to return a non-primitive (object) so you can assert
reference identity (strictEqual) reliably; locate references to
getSlideSchemaForPrompt and the internal builder/cachedSchema in the module to
implement the spy or exposure.

Comment on lines +85 to +93
test("generateBasename: handles empty string title", () => {
const result = generateBasename("", "20260217");
assert.strictEqual(result, "20260217-article");
});

test("generateBasename: handles Japanese title", () => {
const result = generateBasename("日本語のタイトル", "20260217");
assert.ok(result.startsWith("20260217-"));
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find the implementation files
find . -type f -name "*.ts" -o -name "*.js" | grep -E "(url-to-script|test_expression)" | head -20

Repository: receptron/MulmoCast-Slides

Length of output: 134


🏁 Script executed:

# Look for generateBasename and sanitizeBasename implementations
rg -n "generateBasename|sanitizeBasename" --type ts --type js -B 2 -A 10

Repository: receptron/MulmoCast-Slides

Length of output: 11863


🏁 Script executed:

# Examine the test file directly
cat -n tests/test_expression_styles.ts

Repository: receptron/MulmoCast-Slides

Length of output: 4448


Missing edge-case test for titles that sanitize to empty strings.

The Japanese title test at lines 90–93 uses a weak assertion that only checks result.startsWith("20260217-"), leaving it unable to catch if the title part becomes empty after sanitization. While Japanese characters are actually preserved by sanitizeBasename (so the test works correctly), a more serious gap exists: titles containing only special characters (e.g., "!!!", "()()!") would sanitize to empty strings, producing basenames like "20260217-" with trailing dashes—which are invalid filenames. No test currently covers this case.

💡 Suggested test addition
+test("generateBasename: handles title with only special characters", () => {
+  const result = generateBasename("!!!", "20260217");
+  // Should not produce trailing dash
+  assert.ok(!result.endsWith("-"), "result should not end with a trailing dash");
+  assert.notStrictEqual(result, "20260217-", "should not be date-dash only");
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
test("generateBasename: handles empty string title", () => {
const result = generateBasename("", "20260217");
assert.strictEqual(result, "20260217-article");
});
test("generateBasename: handles Japanese title", () => {
const result = generateBasename("日本語のタイトル", "20260217");
assert.ok(result.startsWith("20260217-"));
});
test("generateBasename: handles empty string title", () => {
const result = generateBasename("", "20260217");
assert.strictEqual(result, "20260217-article");
});
test("generateBasename: handles Japanese title", () => {
const result = generateBasename("日本語のタイトル", "20260217");
assert.ok(result.startsWith("20260217-"));
});
test("generateBasename: handles title with only special characters", () => {
const result = generateBasename("!!!", "20260217");
// Should not produce trailing dash
assert.ok(!result.endsWith("-"), "result should not end with a trailing dash");
assert.notStrictEqual(result, "20260217-", "should not be date-dash only");
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_expression_styles.ts` around lines 85 - 93, Add a test for the
edge case where the title sanitizes to an empty string: call generateBasename
with a title made only of special characters (e.g., "!!!" or "()()!") and the
date "20260217" and assert that the returned basename does not end with a
trailing dash but instead falls back to the default slug (e.g.,
"20260217-article"); locate tests near the existing generateBasename and
sanitizeBasename usages in tests/test_expression_styles.ts and add an assertion
like assert.strictEqual(result, "20260217-article") to ensure generateBasename
handles empty sanitized titles.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant