Description
I am seeing problems with the script's duplicate detection for the JPG derivatives. Possible causes: conversion does not always produce byte-identical output, some previous bot runs used slightly different settings, or the script simply does not look for duplicates of the JPGs after a conversion from TIFF.
This is fine when the names are the same: overwriting an older version of a file with a newer one does no harm (and is arguably good, because it enforces consistency).
But when the bot is also moving page names at the same time, it moves the TIFF, leaves the old JPG in place because it did not detect it, and then uploads a new JPG, so the old JPG is no longer linked from the TIFF, creating a bit of a mess.
Example:
https://commons.wikimedia.org/wiki/File:Grand_Canyon._Same_locality_as_433._Old_Nos._470,_473,_500_-_NARA_-_517801.tif
https://commons.wikimedia.org/wiki/File:Grand_Canyon._Same_locality_as_433._Old_Nos._470,_473,_500_-_NARA_-_517801.jpg
https://commons.wikimedia.org/wiki/File:Grand_Canyon._Same_locality_as_433._Old_Nos._470,_473,_500,_1871_-_1878_-_NARA_-_517801.jpg
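One possible workaround, sketched below, is to match derivatives by the NARA identifier at the end of the title instead of by file content, so an existing JPG is found even when its bytes differ from a fresh conversion. This is only an illustration: `nara_key` and `plan_derivative_moves` are hypothetical names, not functions from the script, and the sketch assumes every title ends in `NARA - <id>.<ext>` as in the example above.

```python
import re

# Assumption: titles end with "NARA - <digits>.<ext>" (spaces or underscores).
NARA_SUFFIX = re.compile(r"NARA[ _]-[ _](\d+)\.(\w+)$")


def nara_key(title):
    """Return (nara_id, lowercase extension), or None if there is no NARA suffix."""
    match = NARA_SUFFIX.search(title)
    if match is None:
        return None
    return match.group(1), match.group(2).lower()


def plan_derivative_moves(old_tiff, new_tiff, existing_titles):
    """List (old_jpg, new_jpg) renames that keep JPG derivatives beside a moved TIFF.

    existing_titles is whatever list of file titles the caller already has;
    how it is fetched from the wiki is outside this sketch.
    """
    key = nara_key(old_tiff)
    if key is None:
        return []
    nara_id = key[0]
    # The derivative is assumed to share the TIFF's name, with a .jpg extension.
    new_jpg = re.sub(r"\.\w+$", ".jpg", new_tiff)
    moves = []
    for title in existing_titles:
        candidate = nara_key(title)
        if candidate is None:
            continue
        same_record = candidate[0] == nara_id
        is_jpg = candidate[1] in ("jpg", "jpeg")
        if same_record and is_jpg and title != new_jpg:
            # Several old JPGs could map to one target; the caller must resolve that.
            moves.append((title, new_jpg))
    return moves
```

With a plan like this, the stale JPG could be moved (or flagged for review) before the new derivative is uploaded, rather than left behind under the old name.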