Closed
Add deduplication logic in Otzaria generator to skip books that already exist from Sefaria source. Sefaria has priority over other sources.
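The priority rule above can be sketched as follows. This is only an illustration: the `Book` class here is a simplified stand-in, and matching on `title` is an assumption, since the commit message does not say which key the generator compares.

```kotlin
// Hypothetical, simplified model; the real Book class has more fields.
data class Book(val title: String, val source: String)

/**
 * Keeps every Sefaria book and appends only those Otzaria books whose
 * title is not already used by a Sefaria book.
 * Matching on title is an assumption made for this sketch.
 */
fun mergeWithSefariaPriority(sefaria: List<Book>, otzaria: List<Book>): List<Book> {
    val sefariaTitles = sefaria.map { it.title }.toSet()
    return sefaria + otzaria.filter { it.title !in sefariaTitles }
}
```

Building the title set once keeps the check for each Otzaria book at constant time, which matters when the catalog holds thousands of books.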
…tion
- Add heRef column to book table with index for efficient lookups
- Update Book model with heRef property and KDoc documentation
- Add selectByHeRef query and getBookByHeRef repository method
- Update insert queries to include heRef parameter
- Set heRef from title in Otzaria generator
- Set heRef from heTitle in Sefaria generator

This allows stable identification of books across database regenerations, similar to how line.heRef works for lines.
feat(book): add heRef property for stable Hebrew reference identification
feat(generator): skip duplicate books when Sefaria version exists
Add post-processing script to rename Sefaria categories after import, aligning with Otzaria's naming convention. Runs automatically as part of the generateSeforimDb pipeline, right after Sefaria import and before Otzaria append.

Usage: ./gradlew :sefariasqlite:renameCategories -PseforimDb=/path/to/seforim.db

Closes kdroidFilter/Zayit#247
fix: rename פירושים מודרניים categories to מחברי זמננו
Sefaria's table_of_contents.json doesn't provide an 'order' field for commentary books. Instead, it provides 'base_text_order', which indicates the order of the base text being commented on.

Changes:
- Parse base_text_order as fallback when order is not available
- Look up book order by Hebrew title in addition to English title

This fixes the issue where most commentary books had orderIndex=999.

Closes kdroidFilter/Zayit#263
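A minimal sketch of the fallback and the two-title lookup described above. The `TocEntry` class, the `Double` type for the order values, and the function names are assumptions for illustration; only the field names `order` and `base_text_order` and the default of 999 come from the commit message.

```kotlin
// Hypothetical, simplified view of one table_of_contents.json entry.
data class TocEntry(
    val title: String,
    val heTitle: String,
    val order: Double? = null,         // JSON field "order"
    val baseTextOrder: Double? = null, // JSON field "base_text_order"
)

// Default used when no order information can be found.
const val DEFAULT_ORDER_INDEX = 999.0

/** Prefers "order", then "base_text_order", then the default. */
fun resolveOrder(entry: TocEntry): Double =
    entry.order ?: entry.baseTextOrder ?: DEFAULT_ORDER_INDEX

/** Indexes every entry under both its English and its Hebrew title. */
fun buildOrderLookup(entries: List<TocEntry>): Map<String, Double> =
    entries.flatMap { entry ->
        val order = resolveOrder(entry)
        listOf(entry.title to order, entry.heTitle to order)
    }.toMap()

/** Returns the order for a title in either language, or the default. */
fun orderFor(title: String, lookup: Map<String, Double>): Double =
    lookup[title] ?: DEFAULT_ORDER_INDEX
```

Registering both titles in one map means a caller that only knows the Hebrew title gets the same order as one that knows the English title.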
…order fix: use base_text_order for commentary book ordering
Categories that have no books and no non-empty subcategories are now excluded from the precomputed catalog. This fixes the issue where blacklisted books would leave empty category folders in the tree. The fix works recursively: if a category's only children are themselves empty, the parent category is also excluded.
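The recursive rule can be sketched like this, assuming a simple in-memory tree. The `Category` class and the `pruneEmpty` name are hypothetical and are not taken from the project.

```kotlin
// Hypothetical in-memory category tree used only for this sketch.
data class Category(
    val name: String,
    val books: List<String> = emptyList(),
    val children: List<Category> = emptyList(),
)

/**
 * Returns a copy of the category without empty descendants, or null
 * when the category has no books and no non-empty subcategories.
 */
fun pruneEmpty(category: Category): Category? {
    // Prune children first so emptiness propagates from the leaves upward.
    val keptChildren = category.children.mapNotNull { pruneEmpty(it) }
    return if (category.books.isEmpty() && keptChildren.isEmpty()) {
        null
    } else {
        category.copy(children = keptChildren)
    }
}
```

Because children are pruned before the parent is checked, a chain of nested folders that contains no books anywhere is dropped in a single pass.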
…-catalog fix: skip empty categories when building catalog
Extend RenameCategoriesPostProcess to handle both renaming and merging:
- Simple rename when no target category exists under the same parent
- Merge books and subcategories when target category already exists

Mappings added:
- פירושים מודרניים על התנ״ך/התלמוד/המשנה -> מחברי זמננו
- ספרות מודרנית -> מחברי זמננו (merge under הלכה, מחשבת ישראל, ספרי מוסר, שו״ת)
- ראשונים על התנ״ך -> ראשונים (merge under תנ״ך)
- אחרונים על התנ״ך -> אחרונים (merge under תנ״ך)
- ראשונים/אחרונים על התלמוד -> ראשונים/אחרונים (rename under בבלי)
- ראשונים/אחרונים על המשנה -> ראשונים/אחרונים (rename under משנה)

This ensures consistent category structure between Sefaria and Otzaria sources.
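The rename-or-merge decision can be illustrated on an in-memory tree. This is a sketch under assumptions: the actual RenameCategoriesPostProcess works on the database, and `CategoryNode` and `renameOrMerge` are hypothetical names.

```kotlin
// Hypothetical mutable category node used only for this sketch.
class CategoryNode(
    var name: String,
    val books: MutableList<String> = mutableListOf(),
    val children: MutableList<CategoryNode> = mutableListOf(),
)

/**
 * Applies one mapping under a given parent:
 * renames the source when no sibling already has the target name,
 * otherwise moves its books and subcategories into that sibling.
 */
fun renameOrMerge(parent: CategoryNode, from: String, to: String) {
    if (from == to) return
    val source = parent.children.firstOrNull { it.name == from } ?: return
    val target = parent.children.firstOrNull { it.name == to }
    if (target == null) {
        source.name = to // simple rename
        return
    }
    target.books.addAll(source.books)       // merge books
    target.children.addAll(source.children) // merge subcategories
    parent.children.remove(source)          // drop the now-empty source
}
```

The sketch does not reconcile subcategories that share a name after a merge; a real implementation would need to decide how to handle that case.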
…tegories feat: unify Sefaria/Otzaria category naming with merge support
- Add ancestorCategoryIds to Lucene index for instant category filtering
- Add baseBookOnly parameter to openSession() and computeFacets()
- Filter by is_base_book directly in Lucene instead of fetching all IDs
- Add getAncestorCategoryIds() to SeforimRepository
…g-basebook feat(search): add baseBookOnly filter and ancestorCategoryIds indexing
Extract an interface from SeforimRepository to allow mocking in unit tests. The interface includes methods needed for line selection and navigation:
- getHeadingTocEntryByLineId
- getLineIdsForTocEntry
- getTocEntryIdForLine
- getTocEntry
- getLine
- getPreviousLine
- getNextLine
- getLines
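To show why the extraction helps testing, here is a reduced sketch with three of the listed methods. The signatures, the `Line` class, and the names `LineSource`, `FakeLineSource`, and `linesAround` are all assumptions; the real interface may use different types or suspend functions.

```kotlin
// Hypothetical, simplified line model.
data class Line(val id: Long, val content: String)

// Assumed shape of a small part of the extracted interface.
interface LineSource {
    fun getLine(id: Long): Line?
    fun getPreviousLine(id: Long): Line?
    fun getNextLine(id: Long): Line?
}

// In-memory fake that a unit test can use instead of the database.
class FakeLineSource(lines: List<Line>) : LineSource {
    private val ordered = lines.sortedBy { it.id }

    private fun positionOf(id: Long): Int = ordered.indexOfFirst { it.id == id }

    override fun getLine(id: Long): Line? = ordered.getOrNull(positionOf(id))

    override fun getPreviousLine(id: Long): Line? {
        val i = positionOf(id)
        return if (i < 0) null else ordered.getOrNull(i - 1)
    }

    override fun getNextLine(id: Long): Line? {
        val i = positionOf(id)
        return if (i < 0) null else ordered.getOrNull(i + 1)
    }
}

/** Example consumer: the line itself plus its immediate neighbours. */
fun linesAround(source: LineSource, id: Long): List<Line> =
    listOfNotNull(source.getPreviousLine(id), source.getLine(id), source.getNextLine(id))
```

Code such as `linesAround` depends only on the interface, so a test can pass the fake and never open a database.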
Summary