fix(archive): infer archive type via Magic Numbers instead of filename by balazs-szucs · Pull Request #79 · grimmory-tools/grimmory

balazs-szucs · 2026-03-20T15:20:10Z

📝 Description

This pull request refactors archive type detection throughout the codebase to consistently use content-based detection (via ArchiveUtils.detectArchiveType) instead of relying on file extensions. This improves robustness when handling comic book archives (CBZ, CBR, CB7), ensures correct MIME type assignment, and simplifies related code. The PR also updates tests and the development Docker setup for improved reliability and maintainability.

Simply put, this fixes bugs where the Reader or other parts of the codebase would fail on archives whose underlying archive type was inconsistent with the filename.
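For readers unfamiliar with magic-number detection, the idea can be sketched as follows. This is a minimal illustration, not the project's `ArchiveUtils.detectArchiveType` implementation: the names `SketchArchiveType` and `MagicNumberSketch` are hypothetical, and only the well-known ZIP, RAR and 7z signatures are checked.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical enum for illustration; the real ArchiveUtils.ArchiveType may differ.
enum SketchArchiveType { ZIP, RAR, SEVEN_ZIP, UNKNOWN }

// Hypothetical helper: classifies an archive by its leading bytes, ignoring the filename.
final class MagicNumberSketch {
    // "Rar!" 1A 07 is shared by RAR 4.x (next byte 00) and RAR 5.x (next bytes 01 00).
    private static final byte[] RAR_MAGIC = {0x52, 0x61, 0x72, 0x21, 0x1A, 0x07};
    // "7z" BC AF 27 1C
    private static final byte[] SEVEN_ZIP_MAGIC = {0x37, 0x7A, (byte) 0xBC, (byte) 0xAF, 0x27, 0x1C};

    private MagicNumberSketch() {}

    // Pure function over the header bytes, which keeps it easy to test.
    static SketchArchiveType detect(byte[] header) {
        if (isZip(header)) return SketchArchiveType.ZIP;
        if (startsWith(header, RAR_MAGIC)) return SketchArchiveType.RAR;
        if (startsWith(header, SEVEN_ZIP_MAGIC)) return SketchArchiveType.SEVEN_ZIP;
        return SketchArchiveType.UNKNOWN;
    }

    // Reads at most the first 8 bytes of the file and classifies them.
    static SketchArchiveType detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    // "PK" followed by 03 04 (local file header), 05 06 (empty archive) or 07 08 (spanned archive).
    private static boolean isZip(byte[] h) {
        if (h.length < 4 || h[0] != 0x50 || h[1] != 0x4B) return false;
        return (h[2] == 0x03 && h[3] == 0x04)
                || (h[2] == 0x05 && h[3] == 0x06)
                || (h[2] == 0x07 && h[3] == 0x08);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

With a helper like this, a RAR archive named `book.cbz` would be classified as `RAR` because only the file contents are inspected.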

Required for develop and main. Your PR title must use Conventional Commit format because maintainers squash-merge with the PR title and stable releases are computed from commit history. Example: fix(reader): prevent blank pages on chapter jump

Linked Issue: Fixes #

Required. Every PR must reference an approved issue. If no issue exists, open one and wait for maintainer approval before submitting a PR. Unsolicited PRs without a linked issue will be closed.

🏷️ Type of Change

🔧 Changes

🧪 Testing (MANDATORY)

PRs without this section filled out will be closed. "Tests pass" or "Tested locally" is not sufficient. You must provide specifics.

Manual testing steps you performed:

Regression testing:

Edge cases covered:

Test output:

Backend test output (./gradlew test)

PASTE OUTPUT HERE

Frontend test output (ng test)

PASTE OUTPUT HERE

📸 Screen Recording / Screenshots (MANDATORY)

Every PR must include a screen recording or screenshots showing the change working end-to-end in a running local instance (both backend and frontend). This means you must have actually built, run, and tested the code yourself. PRs without visual proof will be closed without review.

✅ Pre-Submission Checklist

All boxes must be checked before requesting review. Incomplete PRs will be closed without review. No exceptions.

🤖 AI-Assisted Contributions

If any part of this PR was generated or assisted by AI tools (Copilot, Claude, ChatGPT, etc.), all items below are mandatory. You are fully responsible for every line you submit. "The AI wrote it" is not an excuse, and AI-generated PRs that clearly haven't been reviewed are the #1 reason PRs get closed.

I have read and understand every line of this PR and can explain any part of it during review
I personally ran the code and verified it works (not just trusted the AI's output)
PR is scoped to a single logical change, not a dump of everything the AI suggested
Tests validate actual behavior, not just coverage (AI-generated tests often assert nothing meaningful)
No dead code, placeholder comments, TODOs, or unused scaffolding left behind by AI
I did not submit refactors, style changes, or "improvements" the AI suggested beyond the scope of the issue

💬 Additional Context (optional)

Summary by CodeRabbit

Bug Fixes
- Enhanced comic archive format detection to identify file types based on actual archive contents rather than filename extensions for more reliable support.
Chores
- Updated CI/CD pipeline configurations and documentation.
- Improved Docker environment with enhanced archive handling support.

coderabbitai · 2026-03-20T15:20:27Z

📝 Walkthrough

This pull request refactors archive-type detection across comic book services from filename-extension-based checks to using ArchiveUtils.detectArchiveType(). Changes include removing the public isSupportedCbxFormat method, updating MIME type detection, enhancing Docker build configuration with unrar support, and aligning tests with the new detection logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Archive Type Detection Refactoring**: `booklore-api/src/main/java/org/booklore/service/kobo/CbxConversionService.java`, `booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java`, `booklore-api/src/main/java/org/booklore/service/opds/OpdsFeedService.java` | Replaced filename-extension-based validation with `ArchiveUtils.detectArchiveType()` for archive detection. Removed public `isSupportedCbxFormat(String fileName)` method. Updated control flows to switch on detected `ArchiveType` instead of suffix checks. |
| **Test Suite Updates**: `booklore-api/src/test/java/org/booklore/service/kobo/CbxConversionServiceTest.java`, `booklore-api/src/test/java/org/booklore/service/opds/OpdsFeedServiceMimeTypeTest.java`, `booklore-api/src/test/java/org/booklore/service/opds/OpdsFeedServiceTest.java` | Removed tests for deleted `isSupportedCbxFormat` method. Updated MIME type tests to write real temporary files with proper magic bytes (RAR 4.x, 7z). Changed `.cbt` expected MIME type to `application/vnd.comicbook+zip`. Updated archive type from `UNKNOWN` to `RAR` in test setup. |
| **Build & Configuration**: `dev.docker-compose.yml`, `CHANGELOG.md` | Replaced prebuilt Gradle image with inline multi-stage Docker build that includes unrar support via `linuxserver/unrar:7.1.10`. Added `libstdc++` and `libgcc` runtime dependencies. Added changelog entry for version 2.2.2 documenting CI permission fixes and release pipeline updates. |
| **Log Message Update**: `booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java` | Updated CBX cover-generation warning message to reference archive type generically (`'{}' archive`) instead of hardcoded type reference (`'{}' CBZ file`). |
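The "switch on detected `ArchiveType`" pattern for MIME assignment might look roughly like the sketch below. The enum `DetectedType` and the class `CbxMimeSketch` are hypothetical names. Only `application/vnd.comicbook+zip` appears in this PR; `application/vnd.comicbook-rar` is the registered counterpart for RAR, while `application/x-cb7` and the `application/octet-stream` fallback are assumptions and may not match what `OpdsFeedService` actually returns.

```java
// Hypothetical stand-in for the project's ArchiveType enum.
enum DetectedType { ZIP, RAR, SEVEN_ZIP, UNKNOWN }

// Hypothetical helper: maps a content-detected archive type to a MIME type,
// so a mislabelled extension no longer influences the result.
final class CbxMimeSketch {
    private CbxMimeSketch() {}

    static String mimeTypeFor(DetectedType type) {
        return switch (type) {
            case ZIP -> "application/vnd.comicbook+zip";
            case RAR -> "application/vnd.comicbook-rar";
            // Assumption: unofficial but commonly used type for 7z-based comics.
            case SEVEN_ZIP -> "application/x-cb7";
            // Assumption: generic fallback; the project may choose a different default.
            case UNKNOWN -> "application/octet-stream";
        };
    }
}
```

Because the switch is exhaustive over the enum, adding a new archive type forces the mapping to be updated at compile time.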

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hopping through archives with types so refined,
No more extension tricks left behind,
From .cbr to .cb7, now magic bytes lead the way,
With unrar's strength in the Docker to play,
Detection springs forth—accurate, clean, and bright! ✨📚

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description is substantially incomplete. It lacks a linked issue reference (required), no checkboxes are marked for type of change, the Changes section is empty, and the critical Testing section with manual testing steps, regression testing, edge cases, and test output is entirely unfilled. The pre-submission checklist is not completed. | Complete all required sections: add the linked issue number in 'Fixes #', mark the appropriate change type checkbox, list specific changes made, provide detailed manual testing steps with actual test output, include regression testing results, describe edge cases tested, and check all pre-submission checklist items before requesting review. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: switching from filename-based to magic-number-based archive type detection. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java (1)

318-324: Consider caching the detected archive type to avoid redundant detection.

The archive type is detected again here even though it was already determined during scanArchiveMetadata(). Since CachedArchiveMetadata is already passed to this method, consider storing the ArchiveType in the cache to avoid re-reading the file's magic bytes on every page stream.

♻️ Optional: Cache archive type in metadata

 private static class CachedArchiveMetadata {
     final List<String> imageEntries;
     final long lastModified;
     final Charset successfulEncoding;
     final boolean useUnicodeExtraFields;
+    final ArchiveUtils.ArchiveType archiveType;
     volatile long lastAccessed;

-    CachedArchiveMetadata(List<String> imageEntries, long lastModified, Charset successfulEncoding, boolean useUnicodeExtraFields) {
+    CachedArchiveMetadata(List<String> imageEntries, long lastModified, Charset successfulEncoding, boolean useUnicodeExtraFields, ArchiveUtils.ArchiveType archiveType) {
         this.imageEntries = List.copyOf(imageEntries);
         this.lastModified = lastModified;
         this.successfulEncoding = successfulEncoding;
         this.useUnicodeExtraFields = useUnicodeExtraFields;
+        this.archiveType = archiveType;
         this.lastAccessed = System.currentTimeMillis();
     }
 }

Then use metadata.archiveType in streamEntryFromArchive() instead of re-detecting.
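The caching idea in this comment can be sketched in isolation as below. It is a simplified illustration under stated assumptions, not the actual `CbxReaderService` code: `CachedKind`, `ArchiveTypeCacheSketch` and the injected `detector` function are hypothetical, and the cache is keyed by path with `lastModified` used to invalidate stale entries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for ArchiveUtils.ArchiveType.
enum CachedKind { ZIP, RAR, SEVEN_ZIP, UNKNOWN }

// Hypothetical cache: remembers the detected type per path so that repeated
// page streams do not re-read the file's magic bytes.
final class ArchiveTypeCacheSketch {
    // Stores the detected type together with the file timestamp it was detected for.
    record Entry(long lastModified, CachedKind type) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, CachedKind> detector;

    // The detector is injected; in real code it would wrap the magic-byte check.
    ArchiveTypeCacheSketch(Function<String, CachedKind> detector) {
        this.detector = detector;
    }

    // Returns the cached type, re-detecting only when the entry is missing
    // or the file's lastModified timestamp has changed.
    CachedKind typeFor(String path, long lastModified) {
        Entry entry = cache.compute(path, (key, old) ->
                (old != null && old.lastModified() == lastModified)
                        ? old
                        : new Entry(lastModified, detector.apply(key)));
        return entry.type();
    }
}
```

Calling `typeFor` twice with the same path and timestamp runs the detector once; a changed timestamp triggers a fresh detection.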

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java`
around lines 318 - 324, The code redundantly re-detects the archive type using
ArchiveUtils.detectArchiveType(cbxPath.toFile()) in CbxReaderService (inside the
streamEntryFromArchive / switch block); update the caching logic so
scanArchiveMetadata() stores the detected ArchiveUtils.ArchiveType in
CachedArchiveMetadata (e.g., metadata.archiveType) and change the switch to use
metadata.archiveType instead of calling detectArchiveType again; ensure
CachedArchiveMetadata is populated when initially scanning and add null/unknown
handling in streamEntryFromArchive to fall back to detection only if
metadata.archiveType is missing.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Around line 1-11: The changelog entry for 2.2.2 is incorrect/missing the
archive-detection changes introduced in this PR; update CHANGELOG.md to either
(A) add a new release section (e.g., 2.2.3 with date) describing the bugfixes
and improvements for ArchiveUtils.detectArchiveType(), comic book archive
handling (CBZ, CBR, CB7) and MIME/magic-number based detection, or (B) replace
the 2.2.2 entry content with those detailed changes if this PR is intended to be
part of 2.2.2; mention the specific symbols and features changed
(ArchiveUtils.detectArchiveType, CBZ/CBR/CB7 handling, MIME type
detection/magic-number logic) and include concise bullet points summarizing the
fixes and any user-facing behavior changes.

---

Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java`:
- Around line 318-324: The code redundantly re-detects the archive type using
ArchiveUtils.detectArchiveType(cbxPath.toFile()) in CbxReaderService (inside the
streamEntryFromArchive / switch block); update the caching logic so
scanArchiveMetadata() stores the detected ArchiveUtils.ArchiveType in
CachedArchiveMetadata (e.g., metadata.archiveType) and change the switch to use
metadata.archiveType instead of calling detectArchiveType again; ensure
CachedArchiveMetadata is populated when initially scanning and add null/unknown
handling in streamEntryFromArchive to fall back to detection only if
metadata.archiveType is missing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c14b689f-cc47-4d9c-a755-241245805ab4

📥 Commits

Reviewing files that changed from the base of the PR and between 6ef4448 and f5986d5.

📒 Files selected for processing (9)

CHANGELOG.md
booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java
booklore-api/src/main/java/org/booklore/service/kobo/CbxConversionService.java
booklore-api/src/main/java/org/booklore/service/opds/OpdsFeedService.java
booklore-api/src/main/java/org/booklore/service/reader/CbxReaderService.java
booklore-api/src/test/java/org/booklore/service/kobo/CbxConversionServiceTest.java
booklore-api/src/test/java/org/booklore/service/opds/OpdsFeedServiceMimeTypeTest.java
booklore-api/src/test/java/org/booklore/service/opds/OpdsFeedServiceTest.java
dev.docker-compose.yml

💤 Files with no reviewable changes (1)

booklore-api/src/test/java/org/booklore/service/kobo/CbxConversionServiceTest.java

dependabot bot and others added 6 commits March 19, 2026 19:46

chore(deps): bump actions/setup-node from 6.2.0 to 6.3.0 (#1)

2158595

Dependabot couldn't find the original pull request head commit, ea510f4. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/setup-buildx-action from 3.12.0 to 4.0.0 (gr…

c157bf2

…immory-tools#2) Dependabot couldn't find the original pull request head commit, faed6bf. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/build-push-action from 6.19.2 to 7.0.0 (grim…

54fa7f7

…mory-tools#3) Dependabot couldn't find the original pull request head commit, f110823. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/login-action from 3.7.0 to 4.0.0 (grimmory-t…

6ef4448

…ools#6) Dependabot couldn't find the original pull request head commit, 9a8d7a1. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix(archive): infer archive type via Magic Numbers instead of filename

764f2b2

fix(archive): improve archive type detection and improve logging for …

f18723b

…cover image retrieval

Merge branch 'develop' into archive-type-detection

f5986d5

coderabbitai bot reviewed Mar 20, 2026

imajes force-pushed the develop branch 2 times, most recently from 89113d4 to 37ca101 Compare March 20, 2026 22:20

Merge branch 'develop' into archive-type-detection

3d6d794

# Conflicts: # .github/workflows/preview-image.yml

balazs-szucs wants to merge 8 commits into grimmory-tools:develop from balazs-szucs:archive-type-detection
