Conversation
Summary by CodeRabbit
WalkthroughThe PR exposes MetadataContext and TocContext from the epub package initializer and adds corresponding dataclasses. read_metadata/read_toc now return (items, context) tuples and write_metadata/write_toc accept those contexts to perform in-place updates using XMLLikeNode. translator.py threads toc and metadata contexts through task generation and conditionally invokes write-back when contexts exist. xml/self_closing.py removes "meta" from the void tag set. xml/xml_like.py applies regex-based fixes to revert epub-prefixed link attributes to standard HTML attribute names. Tests updated to match new signatures and behaviors. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@epub_translator/epub/metadata.py`:
- Around line 15-18: The comment for the MetadataContext dataclass uses a
fullwidth comma causing RUF003; update the inline comment for the xml_node field
to replace the fullwidth comma with a standard ASCII comma (i.e., change
"XMLLikeNode 对象,保留原始文件信息" to use a normal comma) so the comment for the opf_path
and xml_node fields is unambiguous and Ruff-compliant; locate the
MetadataContext dataclass and edit the comment on the xml_node field
accordingly.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
epub_translator/xml/self_closing.py (2)
165-168:⚠️ Potential issue | 🟡 MinorDocstring example still mentions
<meta>despite exclusion.
unclose_void_elementsno longer processes<meta>because it’s not in_VOID_TAGS, so the example is now misleading.📝 Suggested docstring fix
- <meta charset="utf-8" /> → <meta charset="utf-8"> <br /> → <br> <img src="test.png" /> → <img src="test.png">
6-45:⚠️ Potential issue | 🟠 MajorInconsistency:
<meta>should be self-closed for XML parsing, but it's excluded from_VOID_TAGS.The tests expect
<meta charset="utf-8">→<meta charset="utf-8" />(lines 16, 113, 372-374), butself_close_void_elementswon't process<meta>tags because they're missing from_VOID_TAGS. When this function is called before XML parsing inxml_like.py:49, non-self-closed<meta>tags will cause parsing failures.The comment states "meta is excluded because OPF files have
<meta property="...">content</meta>" (a non-void element), but this conflates two different contexts: OPF metadata documents vs. HTML/XHTML content. Consider making<meta>handling context-aware or restoring it to_VOID_TAGSif only XHTML content is processed here.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@epub_translator/xml/xml_like.py`:
- Around line 224-226: The current unconditional loop that applies
_STANDARD_HTML_ATTRS to xml_string (the pattern.sub(...) block) removes
epub:type/epub:rel from all <link> elements; restrict this rewrite so it only
runs for Package Documents/container.xml contexts or when a caller-provided flag
indicates it's safe. Update the code that invokes the xml_string normalization
(or the function containing the loop) to accept a context parameter (e.g.,
"package" vs "content") or a boolean (e.g., allow_epub_prefixes) and only
execute the for pattern, replacement in _STANDARD_HTML_ATTRS: xml_string =
pattern.sub(replacement, xml_string) when the context is package/container or
the flag is true; alternatively, preserve attributes by tracking original
attribute namespaces and skip replacing attributes that originated from EPUB
Content Documents. Ensure you reference the _STANDARD_HTML_ATTRS list and the
xml_string normalization loop when making the change.
🧹 Nitpick comments (1)
epub_translator/xml/xml_like.py (1)
35-52: Add a regression test for the<link>attribute workaround.Given the regex-based post‑processing in Line 35-52, a small test covering
<link type="text/css" rel="stylesheet">(and a control element like<nav epub:type="toc">) would lock in the intended behavior and reduce regressions.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_metadata.py (1)
16-77:⚠️ Potential issue | 🟠 MajorFix unused
contextvariables to clear lint failures.
Pylint flags unused-variable at Line 22, Line 58, and Line 69. Rename to_contextto keep CI green.🔧 Proposed fix
- metadata, context = read_metadata(zip_file) + metadata, _context = read_metadata(zip_file) ... - metadata, context = read_metadata(zip_file) + metadata, _context = read_metadata(zip_file) ... - metadata, context = read_metadata(zip_file) + metadata, _context = read_metadata(zip_file)
No description provided.