Replacing book files causes duplicate DB entries #221
Replies: 5 comments
-
First, you seem to assume that every book entry is just an ID that points to a database. It is not. These are The

This is where you are making a crucial mistake. The idea of Change File Link is to replace the entry BEFORE the new file is in the database. Don't add it via scanning. If you replace the file without adding it first, you don't get 2 entries. It replaces the

Your way of doing things means that you have imported a duplicate. Some users do want to keep duplicates, and these are 2 distinct files, so in that regard this is expected behavior. But you did explain how using Change File Link can cause a duplicate: the files were added before using it. Thanks for providing an explanation.

What people usually do is keep some kind of staging folder where they import files. They go into the Folder tab, scrape them, add them to the library, etc. Then they use Library Organizer to move them to the required folder.

Yes, the solution would be an add-on, probably updating Change File Link to search the library and check that the chosen book isn't already in it. Library Organizer already does something like that:

    def find_duplicate_book(self, path):
        """
        Tries to find a book in the CR library via a path
        """
        for book in ComicRack.App.GetLibraryBooks():
            if book.FilePath == path:
                return book
        return None

You would need a dialog asking whether to remove the new entry or not, and you would need to be careful: what if that new file is also in another list? Removing it would end up as a file not found.
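A rough sketch of that check plus the "is it in another list?" guard, written as plain Python so it can run outside ComicRack. The `Book` class, `lists_containing`, `can_remove_safely`, and the `New Arrivals` list are hypothetical stand-ins, and modeling static lists as a dict of GUID lists is an assumption, not how a script necessarily sees them:

```python
# Stand-in for the book objects a ComicRack script would get from the host app.
class Book(object):
    def __init__(self, book_id, file_path):
        self.Id = book_id
        self.FilePath = file_path


def find_duplicate_book(library_books, path):
    """Return the library entry whose FilePath equals path, or None."""
    for book in library_books:
        if book.FilePath == path:
            return book
    return None


def lists_containing(static_lists, book_id):
    """Return the names of the static lists that reference book_id."""
    return sorted(name for name, ids in static_lists.items() if book_id in ids)


def can_remove_safely(static_lists, book):
    """Only treat an entry as removable if no static list still points at it."""
    return not lists_containing(static_lists, book.Id)


fixed_path = "Series X [V2020]/Series X - 02 (Digital) (Fixed) (Scanner).cbz"
library = [
    Book("ID2", "Series X [V2020]/Series X - 02 (Digital) (Scanner).cbz"),
    Book("ID50", fixed_path),  # the fixed file was already scanned in
]
static_lists = {
    "Series X [V2020]": ["ID1", "ID2", "ID3"],
    "Event Y [2019]": ["ID2", "ID3"],
    "New Arrivals": ["ID50"],  # hypothetical list that already uses the new entry
}

duplicate = find_duplicate_book(library, fixed_path)
if duplicate is not None and not can_remove_safely(static_lists, duplicate):
    print("Entry %s is still used in: %s" % (
        duplicate.Id, ", ".join(lists_containing(static_lists, duplicate.Id))))
```

In a real add-on this is the point where the dialog would go: the script only knows that the entry is referenced somewhere, and the user decides what happens to it.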
-
Ah, these are great points. I know from experience that moving a file that's already in the DB will usually update its path on the next scan, so yea, changing the link before adding to the DB should bypass the duplicate entry being created. I do use a staging folder of sorts where freshly downloaded files go, and it doesn't get scanned. I use the Folders tab a lot to create Lists, since I usually find the file browser interface easier to navigate instead of running searches on the Library as a whole.
And of course, thanks for the response and the help and the clarification. I don't have enough knowledge to dig into the codebase and really understand what's happening under the hood, so most of this was based on what little I could glean from the process itself and digging into the XML (esp since I don't know how to dig into the SQL DB manually either 😅). So lots of assumptions, but I'm glad it's starting to piece together in my brain.
-
They are still

What you are describing is exactly what the Duplicate Manager plugin does. It already has mentions for c2c, noads, etc. It seems a little hard to configure, but it is what you want, or at least something to use as inspiration. I am not sure it will fix your list issue, but it might be worth a look. The problem is always how to decide which files are duplicates, because they might have different read progress, some might be newly imported with no metadata, etc. Automating that is the difficult part.

Issue #153 has the same issue. The fix could be easy: have Change File Link check for a duplicate using the above code and, if the chosen file is already in the database, remove the entry in the database ( Maybe there is a use for

See the wiki page for a list of API that plugins can use including

If you back up the database while connected to a SQL db, it now outputs the complete ComicDB.xml, including the books. It can be used to switch to a regular XML database or to restore that SQL db. You can also use it just to see how it's formatted. Underneath, it's still all just XML, even in SQL.
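If you just want a feel for how an exported XML file is laid out without reading it top to bottom, a small script can count which element names appear. This is only a sketch: the sample below is a made-up miniature, not the real ComicDB.xml layout, and `tag_counts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up miniature document; the real export will use different element names.
SAMPLE = """<Database>
  <Books>
    <Book Id="ID1"><Series>Series X</Series><Number>1</Number></Book>
    <Book Id="ID2"><Series>Series X</Series><Number>2</Number></Book>
  </Books>
  <Lists>
    <List Name="Event Y [2019]"><Id>ID2</Id></List>
  </Lists>
</Database>"""


def tag_counts(xml_text):
    """Count how often each element name occurs anywhere in the document."""
    root = ET.fromstring(xml_text)
    return Counter(elem.tag for elem in root.iter())


counts = tag_counts(SAMPLE)
for tag, n in sorted(counts.items()):
    print(tag, n)

# For a file on disk, ET.parse(path).getroot() gives the same kind of root element.
```

Run against a backup copy, the output gives you the element names to search for when you open the file in a text editor.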
-
Yo, this is all such great info. Really, thanks for taking the time to dig into this with me. I did take a look at Duplicates Manager, but you're right: it doesn't seem to do anything with Lists, and the rulesets don't seem to be quite broad enough. There's only a few properties that it can catch, so while it can work for some situations, it wouldn't for others. I'm taking a closer look at Library Organizer as well, but based on the logic I see, it also wouldn't do anything with Lists (I could be wrong). It definitely has a trigger for finding duplicate books and gives an option to replace in the DB, but I don't think it replaces all references to the Object across everything? Maybe I'm wrong, still testing that out. Still, this has given me a starting point to work on digging myself out of the hole, and hopefully implementing a way to avoid it in the future? I'm still learning Python and programming in general, but this could be a project to learn with.
-
I am not sure either, but based on what I see it seems so. It removes the old file before moving the new one and removes that old file from the db. But that probably only happens if the file already exists; in your case, only using it to move to a folder without renaming would probably not work.

    elif self.duplicate_action is DuplicateAction.Overwrite:
        try:
            if self.profile.CopyReadPercentage and type(oldbook) is not FileInfo:
                book.LastPageRead = oldbook.LastPageRead
            FileIO.FileSystem.DeleteFile(undo_path, FileIO.UIOption.OnlyErrorDialogs, FileIO.RecycleOption.SendToRecycleBin)
        except Exception, ex:
            self.logger.Add("Failed", self.report_book_name, "Failed to overwrite " + undo_path + ". The error was: " + str(ex))
            return MoveResult.Failed
        #Since we are only working with images there is no need to remove a book from the library
        if type(oldbook) is not FileInfo:
            ComicRack.App.RemoveBook(oldbook)

Also, a tip if you are new to Python: this isn't Python as such, it is IronPython, a mix of .NET and Python, which is why you see things like FileIO.FileSystem.DeleteFile. You can usually use either Python functions or .NET functions, but .NET is very specific about types and Python isn't.
-
Not really a bug or a feature request, but putting this in here by request. Hopefully can spitball some solutions.
First, the caveats:
- No editing of book files (CBR/CBZ/etc) allowed. This includes renaming, embedding meta, etc. Moving of files to different folders/paths is fine, but the file itself remains untouched (same filename and hash).
- This whole thing is related to static Lists, not SmartLists: SmartLists solve the problem, but come with their own quirks and issues, and are outside scope.
- This is a deliberately obtuse edge case example to illustrate the behaviour in question.
- Personally, I don't use add-ons like Database Manager or Library Organizer, so I don't know if/how they approach problems like this. Maybe there's already a solution out there for this.
- In my case I'm using an SQL database, so there's a little bit more fragmentation involved.
Situation:
Let's say we have a series/volume: we'll call it SERIES X. It's a 6-issue run, sorted into a watched folder that matches the series name. Example filenames:

    Series X [V2020]/Series X - 01 (Digital) (Scanner).cbz
    Series X [V2020]/Series X - 02 (Digital) (Scanner).cbz
    Etc

In CRCE, these books have been added to the library via the scanning/watched folder process (so assigned GUIDs) and metadata tagged via ComicVine. For the sake of example, let's also say the GUIDs are sequential, so Issue 1 is ID1, Issue 2 is ID2, etc.

The books are also added to Lists (Static/Dumb List, not SmartList).
- Issues 1-6 are in a List corresponding to the Volume: Series X [V2020]
- Issues 2-4 are also part of a crossover/event, so are on another List for that event: Event Y [2019]
- All 6 issues are also on a List for a reading order for the universe as a whole: Publisher Z Chronological

So let's break that all down: <Series #> (<lists it's included on>) <GUID>
- Series X 1 (Series X [V2020], Publisher Z Chronological) <ID1>
- Series X 2 (Series X [V2020], Publisher Z Chronological, Event Y [2019]) <ID2>
- Series X 3 (Series X [V2020], Publisher Z Chronological, Event Y [2019]) <ID3>
- Series X 4 (Series X [V2020], Publisher Z Chronological, Event Y [2019]) <ID4>
- Series X 5 (Series X [V2020], Publisher Z Chronological) <ID5>
- Series X 6 (Series X [V2020], Publisher Z Chronological) <ID6>

As these are all static lists, they're held in ComicDB.xml as just a list of GUIDs: no other reference point.
Now, say we get a new file: a fixed version of Issue 2. It has filename Series X - 02 (Digital) (Fixed) (Scanner).cbz, and we want to replace the original.

So you (or some utility like Library Organizer):
- Go into the folder: Series X [V2020]
- Delete the original file: Series X - 02 (Digital) (Scanner).cbz
- Paste in the new Fixed file: Series X - 02 (Digital) (Fixed) (Scanner).cbz

On the next scan, the new file is added to the library and assigned a GUID (let's say ID50). This step could be done before or after the actual file move, doesn't matter.

Upon loading any of the lists that contained the previous file, it'll show as FILE NOT FOUND. If you want to update the book in each of those lists, you can:
(A) Identify which lists the original file was in: right-click>Show In List, then drag-and-drop the new book from the library into those lists and do whatever sorting/organizing you need to.
(B) Hacky option: use the Change File Link script to point ID2 to the new file
Option (A) could be time-consuming (depending on the number of Lists it's contained in) and still requires fiddling with order if you have a custom sort, but would fully replace the old file with the new one.
Option (B) just replaces the file across all Lists (sorcery), but the problem is that ID2 and ID50 both still exist in the database: they're just pointing to the same file.

You could remove ID2 from the Library, but that would also remove the book from Lists, defeating the purpose.

You could remove ID50, but it would be re-added to the library on the next scan unless you have the 'Files manually removed from the Library will not be added again' option enabled in the scan settings, which you might not want for whatever reason.

Now, as to possible solutions.
The most direct one I can think of is literally just a find->replace of the GUID in ComicDB.xml. This could technically even be done externally in a text editor regardless of DB state, since lists are plaintext in the XML file.
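A minimal sketch of what such a find->replace could look like outside ComicRack. The XML here is a made-up miniature in which list references are element text and a book's own ID is an attribute; that split is an assumption for illustration, and the real ComicDB.xml layout should be checked on a backup copy first. `replace_guid_references` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up miniature; element and attribute names are NOT taken from ComicDB.xml.
SAMPLE = """<Database>
  <Books>
    <Book Id="ID2" />
    <Book Id="ID50" />
  </Books>
  <Lists>
    <List Name="Series X [V2020]"><Id>ID1</Id><Id>ID2</Id><Id>ID3</Id></List>
    <List Name="Event Y [2019]"><Id>ID2</Id><Id>ID3</Id></List>
  </Lists>
</Database>"""


def replace_guid_references(root, old_guid, new_guid):
    """Rewrite element text equal to old_guid; attributes are left untouched."""
    replaced = 0
    for elem in root.iter():
        if elem.text is not None and elem.text.strip() == old_guid:
            elem.text = new_guid
            replaced += 1
    return replaced


root = ET.fromstring(SAMPLE)
count = replace_guid_references(root, "ID2", "ID50")
result = ET.tostring(root, encoding="unicode")
print("references replaced:", count)
```

Working on parsed elements instead of a blind text replace keeps each list's order intact and avoids rewriting the original book's own entry into a second ID50, which a plain editor find->replace over the whole file could do.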
Ideally, this would use some kind of GUI to identify the files to be replaced (Original) and be replaced with (Fixed) using the GUIDs, and would copy all the existing metadata from Original to Fixed, but that feels maybe out of scope. Either way, I could envision the UI working the same way as Change File Link:
3.1 Check to see if the Fixed file is already in the DB;
3.2 If yes, grab its GUID
3.3 If no, add to the DB and grab GUID
6.1 If found, replace found entries with Fixed GUID
I'm like 90% certain this could be done through an add-on, unless there's scripting limitations on what meta can be copied or what changes can be made to ComicDB.xml? I'm no expert though, and it's beyond my current abilities but I'm trying to poke around and learn what I can.
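To make the idea a bit more concrete, here is a sketch of the surviving steps (check whether the Fixed file is already in the DB, take or create its GUID, replace found entries) plus the metadata copy, using plain in-memory stand-ins instead of the ComicRack API. Everything here is hypothetical: the `Book` class, the choice of `COPIED_FIELDS`, the dict-of-lists model, and the decision to drop the original entry at the end.

```python
import uuid


class Book(object):
    """Stand-in for a library entry; only a few illustrative fields."""
    def __init__(self, file_path, book_id=None):
        self.Id = book_id or str(uuid.uuid4())
        self.FilePath = file_path
        self.Series = ""
        self.LastPageRead = 0


COPIED_FIELDS = ("Series", "LastPageRead")  # hypothetical selection of metadata


def get_or_add(library, path):
    """Return the entry for path, adding a new one (new GUID) if it is missing."""
    for book in library:
        if book.FilePath == path:
            return book
    book = Book(path)
    library.append(book)
    return book


def replace_everywhere(library, static_lists, original, fixed_path):
    """Swap original for the Fixed file in every list, keeping list order."""
    fixed = get_or_add(library, fixed_path)
    if fixed is original:
        return fixed  # nothing to do, same entry
    for field in COPIED_FIELDS:
        setattr(fixed, field, getattr(original, field))
    for ids in static_lists.values():
        for i, book_id in enumerate(ids):
            if book_id == original.Id:
                ids[i] = fixed.Id
    library.remove(original)  # assumption: the old entry is no longer wanted
    return fixed


original = Book("Series X - 02 (Digital) (Scanner).cbz", "ID2")
original.Series, original.LastPageRead = "Series X", 12
library = [original]
static_lists = {
    "Series X [V2020]": ["ID1", "ID2", "ID3"],
    "Event Y [2019]": ["ID2", "ID3"],
}
fixed = replace_everywhere(
    library, static_lists, original, "Series X - 02 (Digital) (Fixed) (Scanner).cbz")
```

Whether an actual add-on can edit static lists and copy every field this way depends on what the scripting API exposes, which is exactly the open question above.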
Either way, replacing files is not uncommon within the digital comic world, be it (F)ixed files or changing from an old C2C to a NoAds or Digital with better quality, so a way to seamlessly replace a book across the whole Library with a better copy would be a welcome addition (at least, for me). Thanks for reading all this, and I'm looking forward to seeing what people come up with (but pls no "Just use SmartLists" thnx). 😁