Replies: 1 comment
This just covers a small portion of what you brought up, but you can change the |
Unfortunately, a lot of sites host duplicated files under different filenames, and gallery-dl ends up downloading all of them. I usually do a manual cleanup with fdupes, which frees up a lot of space. However, the next time I run the script, the duplicate files get downloaded all over again.
To give a practical example of this problem, try downloading and later updating the belledelphine page on coomer; a large portion of it is made up of duplicated files.
So here is what I think would be interesting: the first thing gallery-dl should do is check whether the filename is already saved. If it isn't, gallery-dl could then check the hash before downloading (or at least before saving the file to disk) to see whether a file with exactly the same hash already exists. If one does, then instead of downloading/saving the file it could just create a symlink to the exact match you saved earlier. The next time you run the script with the same parameters, it would detect the symlink and skip both the original file and the duplicate that the symlink stands in for. This adds some extra steps, but it saves a lot of bandwidth.
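To make the idea concrete, here is a rough Python sketch of the check I have in mind. It is not gallery-dl's actual code: `save_or_link`, `sha256_of` and the in-memory `index` dict are hypothetical names made up for illustration. It also assumes the file contents are already in memory, so as written it only saves disk space; saving bandwidth as well would require the site to expose the hash before the download, which I am only assuming is possible. Creating symlinks may also need extra privileges on Windows.

```python
import hashlib
import os


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def save_or_link(directory: str, filename: str, data: bytes, index: dict) -> str:
    """Save `data` as directory/filename, or symlink to an identical saved file.

    `index` (hypothetical) maps a content hash to the path of the first file
    saved with that content. Returns "skipped", "linked" or "saved".
    """
    path = os.path.join(directory, filename)

    # Step 1: the name already exists (regular file or symlink), so skip it.
    if os.path.lexists(path):
        return "skipped"

    # Step 2: look for an already saved file with exactly the same hash.
    digest = sha256_of(data)
    original = index.get(digest)
    if original is not None and os.path.exists(original):
        # Use a relative target so the whole directory can be moved later.
        os.symlink(os.path.relpath(original, directory), path)
        return "linked"

    # Step 3: no match, so save the file and remember its hash.
    with open(path, "wb") as fh:
        fh.write(data)
    index[digest] = path
    return "saved"
```

On a second run the symlinked name is caught by the filename check in step 1, so neither the original nor the duplicate is written again. A real version would need to keep the hash index on disk between runs instead of in a dict.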