fix(metadata): fix metadata fetching relying for filename "first" instead of better fields like ISBN in Goodreads/Bookdrop by balazs-szucs · Pull Request #85 · grimmory-tools/grimmory

balazs-szucs · 2026-03-20T20:06:04Z

📝 Description

This pull request introduces improvements to metadata fetching and lookup logic for book files, especially around the handling of insufficient metadata and ISBN-based lookups for Goodreads and bookdrop. Normally, you would wanna query via the "most specific" information possible. This, however, with Goodreads parser wasn't the case. It'd premateraly fallback to filenames which were in no-way accurate way to find bookmetata

Required for develop and main. Your PR title must use Conventional Commit format because maintainers squash-merge with the PR title and stable releases are computed from commit history. Example: fix(reader): prevent blank pages on chapter jump

Linked Issue: Fixes #

Required. Every PR must reference an approved issue. If no issue exists, open one and wait for maintainer approval before submitting a PR. Unsolicited PRs without a linked issue will be closed.

🏷️ Type of Change

🔧 Changes

🧪 Testing (MANDATORY)

PRs without this section filled out will be closed. "Tests pass" or "Tested locally" is not sufficient. You must provide specifics.

Manual testing steps you performed:

Regression testing:

Edge cases covered:

Test output:

Backend test output (./gradlew test)

PASTE OUTPUT HERE

Frontend test output (ng test)

PASTE OUTPUT HERE

📸 Screen Recording / Screenshots (MANDATORY)

Every PR must include a screen recording or screenshots showing the change working end-to-end in a running local instance (both backend and frontend). This means you must have actually built, run, and tested the code yourself. PRs without visual proof will be closed without review.

✅ Pre-Submission Checklist

All boxes must be checked before requesting review. Incomplete PRs will be closed without review. No exceptions.

🤖 AI-Assisted Contributions

If any part of this PR was generated or assisted by AI tools (Copilot, Claude, ChatGPT, etc.), all items below are mandatory. You are fully responsible for every line you submit. "The AI wrote it" is not an excuse, and AI-generated PRs that clearly haven't been reviewed are the #1 reason PRs get closed.

I have read and understand every line of this PR and can explain any part of it during review
I personally ran the code and verified it works (not just trusted the AI's output)
PR is scoped to a single logical change, not a dump of everything the AI suggested
Tests validate actual behavior, not just coverage (AI-generated tests often assert nothing meaningful)
No dead code, placeholder comments, TODOs, or unused scaffolding left behind by AI
I did not submit refactors, style changes, or "improvements" the AI suggested beyond the scope of the issue

💬 Additional Context (optional)

Summary by CodeRabbit

New Features
- Added ISBN-based metadata lookup for enhanced book identification.
Improvements
- Enhanced metadata handling to flag items with insufficient identifiers as pending review, reducing unnecessary online lookups.

Dependabot couldn't find the original pull request head commit, ea510f4. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…immory-tools#2) Dependabot couldn't find the original pull request head commit, faed6bf. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…mory-tools#3) Dependabot couldn't find the original pull request head commit, f110823. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…ools#6) Dependabot couldn't find the original pull request head commit, 9a8d7a1. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…tead of better fields like ISBN in Goodreads/Bookdrop

- Image builds: - pin frontend and backend Docker build stages to the build platform so multi-arch packaging can reuse architecture-independent work - add an Angular build cache mount and move dynamic version metadata to the end of the runtime stage so static layers stay reusable across tags - reduce image workflow checkout depth and keep preview builds on `linux/amd64` only to avoid unnecessary QEMU and history overhead - Cache policy: - make CI packaging smoke builds consume the shared image cache without writing new BuildKit cache state - make normal preview builds consume shared GHA and registry caches without mutating the canonical cache - keep nightly and stable release builds as the workflows that refresh the shared image cache in both GHA and registry backends - Preview override: - add a `refresh_shared_cache` input to the preview workflow for an explicit no-cache rebuild that repopulates the shared cache when maintainers need to bust and refresh it - keep the default preview behavior optimized for fast disposable builds rather than cache churn - Validation: - keep workflow YAML parsing clean after the cache-policy changes - keep `git diff --check` clean for the touched Docker and workflow files

coderabbitai · 2026-03-20T20:06:24Z

📝 Walkthrough

Walkthrough

Version 2.2.2 release documentation added to CHANGELOG. BookdropMetadataService refactored to conditionally skip online metadata refresh for non-searchable entries. GoodReadsParser enhanced with ISBN-based lookup path and centralized ID extraction helper.

Changes

Cohort / File(s)	Summary
Release Documentation `CHANGELOG.md`	Version 2.2.2 entry added documenting CI permission forwarding fix, release pipeline rebranding, and README link update.
Metadata Service Logic `booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`	Added conditional metadata refresh skip logic via `hasSearchableMetadata()` and `hasAnyKnownIdentifier()` helpers. Non-searchable metadata now marked `PENDING_REVIEW` without fetching remote data. Removed `primaryFile` field from `Book` construction.
Parser Enhancement `booklore-api/src/main/java/org/booklore/service/metadata/parser/GoodReadsParser.java`	Added ISBN-based lookup path in `fetchTopMetadata()` with new `extractGoodreadsIdFromOgUrl()` helper. Refactored existing ISBN handling to use centralized ID extraction logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hop, hop, through metadata we go,
ISBN paths, now shine and glow!
Searchable treasures, pending review,
Code hops forward with logic anew! ✨📚

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description provides a brief explanation of the change but is severely incomplete: no linked issue, type of change unchecked, changes/testing sections unfilled, no test output, and no required screen recording/screenshots.	Complete required sections: link an approved issue, check the type of change, fill testing steps with specifics, provide actual test output, and attach screen recording or screenshots proving the change works end-to-end.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main fix: improving metadata fetching to prioritize ISBN over filenames in Goodreads/Bookdrop lookups, and follows Conventional Commits format.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (3)

CHANGELOG.md (1)

1-12: Move # Changelog back to the top of the file.

The top-level heading should be the first line for consistent changelog structure/readability (and better compatibility with tooling that expects it there).

Proposed reorder

-## [2.2.2](https://github.com/grimmory-tools/grimmory/compare/v2.2.1...v2.2.2) (2026-03-19)
-
-### Bug Fixes
-
-* **ci:** forward checks permission to reusable test jobs ([e3b3e7c](https://github.com/grimmory-tools/grimmory/commit/e3b3e7c4d7aae23f517b6029af56df360d40edf1))
-
-### Chores
-
-* **ci:** rebrand and streamline the release pipeline ([`#66`](https://github.com/grimmory-tools/grimmory/issues/66)) ([1bc45da](https://github.com/grimmory-tools/grimmory/commit/1bc45da013415dc66621dc72eb21be9640028f4d))
-* Update README to use new links, and clean up formatting ([6b0c4aa](https://github.com/grimmory-tools/grimmory/commit/6b0c4aa66c2247de0335340dbae8ec0afc1fbd59))
-
 # Changelog
 
 All notable Grimmory releases are tracked here.
 
 This file is maintained by semantic-release.
+
+## [2.2.2](https://github.com/grimmory-tools/grimmory/compare/v2.2.1...v2.2.2) (2026-03-19)
+
+### Bug Fixes
+
+* **ci:** forward checks permission to reusable test jobs ([e3b3e7c](https://github.com/grimmory-tools/grimmory/commit/e3b3e7c4d7aae23f517b6029af56df360d40edf1))
+
+### Chores
+
+* **ci:** rebrand and streamline the release pipeline ([`#66`](https://github.com/grimmory-tools/grimmory/issues/66)) ([1bc45da](https://github.com/grimmory-tools/grimmory/commit/1bc45da013415dc66621dc72eb21be9640028f4d))
+* Update README to use new links, and clean up formatting ([6b0c4aa](https://github.com/grimmory-tools/grimmory/commit/6b0c4aa66c2247de0335340dbae8ec0afc1fbd59))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 1 - 12, Move the top-level heading "# Changelog"
to the very first line of the file so the file begins with that heading,
followed by a single blank line and then the existing release entries (starting
with "## [2.2.2]..."). Ensure the "# Changelog" line is removed from its current
location and placed above the "## [2.2.2]" section so tooling and readability
are consistent.

booklore-api/src/main/java/org/booklore/service/metadata/parser/GoodReadsParser.java (1)

137-148: Good refactor to use the centralized helper.

The refactoring to use extractGoodreadsIdFromOgUrl is consistent with the new approach in fetchTopMetadata.

However, note that unlike fetchTopMetadata (lines 74-86), there's no try-catch around the fetchDoc() call on line 140. If fetchDoc() throws a RuntimeException (which it does on IOException), the entire method fails instead of falling back to the preview search. Consider wrapping this block in a try-catch for consistency.

♻️ Optional: Add error handling for consistency with fetchTopMetadata

         String isbn = ParserUtils.cleanIsbn(fetchMetadataRequest.getIsbn());
         if (isbn != null && !isbn.isBlank()) {
             log.info("Goodreads Query URL (ISBN): {}{}", BASE_ISBN_URL, isbn);
-            Document doc = fetchDoc(BASE_ISBN_URL + isbn);
-            String goodreadsId = extractGoodreadsIdFromOgUrl(doc);
-            if (goodreadsId != null) {
-                BookMetadata metadata = parseBookDetails(doc, goodreadsId);
-                if (metadata != null) {
-                    return List.of(metadata);
+            try {
+                Document doc = fetchDoc(BASE_ISBN_URL + isbn);
+                String goodreadsId = extractGoodreadsIdFromOgUrl(doc);
+                if (goodreadsId != null) {
+                    BookMetadata metadata = parseBookDetails(doc, goodreadsId);
+                    if (metadata != null) {
+                        return List.of(metadata);
+                    }
                 }
+            } catch (Exception e) {
+                log.warn("GoodReads: ISBN lookup failed: {}, falling back to preview search", e.getMessage());
             }
         }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/main/java/org/booklore/service/metadata/parser/GoodReadsParser.java`
around lines 137 - 148, Wrap the ISBN-path in GoodReadsParser so the call to
fetchDoc(BASE_ISBN_URL + isbn) and the subsequent
extractGoodreadsIdFromOgUrl/parseBookDetails logic are enclosed in a try-catch
that catches RuntimeException (the same way fetchTopMetadata does); on exception
log the error (including the ISBN and exception) and allow the method to
continue so it falls back to the preview/search flow instead of propagating the
RuntimeException. Ensure you reference the existing helper methods
extractGoodreadsIdFromOgUrl and parseBookDetails when adding the try-catch so
behavior remains identical on success.

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)

118-127: Maintenance note: Keep synchronized with BookMetadata.

The comment on line 119 is helpful. The explicit list is clear and readable, though it does require manual updates when new provider IDs are added to BookMetadata.

For future consideration, an annotation-based approach or a dedicated method on BookMetadata (e.g., hasAnyIdentifier()) could reduce the maintenance burden, but this is fine for now.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`
around lines 118 - 127, Replace the manual list check in hasAnyKnownIdentifier
with a delegated method on the BookMetadata model: implement a public boolean
hasAnyIdentifier() in BookMetadata that checks all provider ID fields (same list
currently used: getIsbn13, getIsbn10, getAsin, getGoodreadsId, getGoogleId,
getHardcoverId, getHardcoverBookId, getComicvineId, getDoubanId,
getLubimyczytacId, getRanobedbId, getAudibleId) and then update
BookdropMetadataService.hasAnyKnownIdentifier(BookMetadata m) to simply return
m.hasAnyIdentifier(); this centralizes the identifier-maintenance surface and
keeps the comment in sync.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 118-127: Replace the manual list check in hasAnyKnownIdentifier
with a delegated method on the BookMetadata model: implement a public boolean
hasAnyIdentifier() in BookMetadata that checks all provider ID fields (same list
currently used: getIsbn13, getIsbn10, getAsin, getGoodreadsId, getGoogleId,
getHardcoverId, getHardcoverBookId, getComicvineId, getDoubanId,
getLubimyczytacId, getRanobedbId, getAudibleId) and then update
BookdropMetadataService.hasAnyKnownIdentifier(BookMetadata m) to simply return
m.hasAnyIdentifier(); this centralizes the identifier-maintenance surface and
keeps the comment in sync.

In
`@booklore-api/src/main/java/org/booklore/service/metadata/parser/GoodReadsParser.java`:
- Around line 137-148: Wrap the ISBN-path in GoodReadsParser so the call to
fetchDoc(BASE_ISBN_URL + isbn) and the subsequent
extractGoodreadsIdFromOgUrl/parseBookDetails logic are enclosed in a try-catch
that catches RuntimeException (the same way fetchTopMetadata does); on exception
log the error (including the ISBN and exception) and allow the method to
continue so it falls back to the preview/search flow instead of propagating the
RuntimeException. Ensure you reference the existing helper methods
extractGoodreadsIdFromOgUrl and parseBookDetails when adding the try-catch so
behavior remains identical on success.

In `@CHANGELOG.md`:
- Around line 1-12: Move the top-level heading "# Changelog" to the very first
line of the file so the file begins with that heading, followed by a single
blank line and then the existing release entries (starting with "##
[2.2.2]..."). Ensure the "# Changelog" line is removed from its current location
and placed above the "## [2.2.2]" section so tooling and readability are
consistent.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b3956752-f3e5-4e24-ad0c-3ef34a067cb8

📥 Commits

Reviewing files that changed from the base of the PR and between 926088c and d0150e1.

📒 Files selected for processing (3)

CHANGELOG.md
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
booklore-api/src/main/java/org/booklore/service/metadata/parser/GoodReadsParser.java

dependabot bot and others added 8 commits March 19, 2026 19:46

chore(deps): bump actions/setup-node from 6.2.0 to 6.3.0 (#1)

2158595

Dependabot couldn't find the original pull request head commit, ea510f4. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/setup-buildx-action from 3.12.0 to 4.0.0 (gr…

c157bf2

…immory-tools#2) Dependabot couldn't find the original pull request head commit, faed6bf. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/build-push-action from 6.19.2 to 7.0.0 (grim…

54fa7f7

…mory-tools#3) Dependabot couldn't find the original pull request head commit, f110823. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

chore(deps): bump docker/login-action from 3.7.0 to 4.0.0 (grimmory-t…

6ef4448

…ools#6) Dependabot couldn't find the original pull request head commit, 9a8d7a1. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix(metadata): fix metadata fetching relying for filename "first" ins…

e454b7c

…tead of better fields like ISBN in Goodreads/Bookdrop

docs: Add a note about how to make a release

926088c

Merge branch 'develop' into bookdrop-metadata-fix

d0150e1

coderabbitai bot reviewed Mar 20, 2026

View reviewed changes

imajes force-pushed the develop branch 2 times, most recently from 89113d4 to 37ca101 Compare March 20, 2026 22:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(metadata): fix metadata fetching relying for filename "first" instead of better fields like ISBN in Goodreads/Bookdrop#85

fix(metadata): fix metadata fetching relying for filename "first" instead of better fields like ISBN in Goodreads/Bookdrop#85
balazs-szucs wants to merge 8 commits intogrimmory-tools:developfrom
balazs-szucs:bookdrop-metadata-fix

balazs-szucs commented Mar 20, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Mar 20, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

balazs-szucs commented Mar 20, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📝 Description

🏷️ Type of Change

🔧 Changes

🧪 Testing (MANDATORY)

📸 Screen Recording / Screenshots (MANDATORY)

✅ Pre-Submission Checklist

🤖 AI-Assisted Contributions

💬 Additional Context (optional)

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

balazs-szucs commented Mar 20, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 20, 2026 •

edited

Loading